Abstract

An important problem that commonly arises in areas such as internet
traffic-flow analysis, phylogenetics and electrical circuit design is to find
a representation of any given metric $D$ on a finite set by an edge-weighted
graph, such that the total edge length of the graph is minimum over all such
graphs. Such a graph is called an optimal realization, and finding such
realizations is known to be NP-hard. Recently S. Varone presented a heuristic
greedy algorithm for computing optimal realizations. Here we present an
alternative heuristic that exploits the relationship between realizations of
the metric $D$ and its so-called tight span $T_D$. The tight span $T_D$ is a
canonical polytopal complex that can be associated to $D$, and our approach
explores parts of $T_D$ for realizations in a way that is similar to the
classical simplex algorithm. We also provide computational results
illustrating the performance of our approach for different types of metrics,
including $\ell_1$-distances and two-decomposable metrics for which it is
provably possible to find optimal realizations in their tight spans.

1. Introduction



An important problem that commonly arises in areas such as internet
traffic-flow analysis (Chung et al., 2001), phylogenetics (Bandelt and Dress,
1992) and electrical circuit design (Hakimi and Yau, 1964) is to realize any
given metric $D$ on some finite set $X$ by an edge-weighted graph with $X$
labeling its vertex set, often with the additional requirement that the total
edge length of the graph is minimum. This can be useful, for example,
for visualizing the metric, or for trying to better understand its structural
properties. More formally, this optimization problem can be stated as follows.
A realization $(G, \omega, \tau)$ of $D$ is a connected graph $G = (V, E)$ with
vertex set $V$ and edge set $E$, together with an edge-weighting
$\omega : E \to \mathbb{R}_{>0}$ and a labeling map $\tau : X \to V$ such that,
for all $x, y \in X$, $D(x, y)$ is the length of a shortest path from $\tau(x)$
to $\tau(y)$ in $G$ (cf. Figure 1(a) and (b)). The problem then is to find an
optimal realization of $D$, that is, a realization of $D$ that has minimum
total edge length over all possible realizations of $D$.

Figure 1: (a) A metric $D$ on $X = \{a, b, c, d, e, f\}$. (b) A realization of
$(X, D)$ that is not optimal. Vertices associated with an element of $X$ are
drawn as black dots, the remaining vertices are drawn as empty circles.
(c),(d) Two optimal realizations of $(X, D)$.
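
The definition of a realization can be checked mechanically for small examples. The following Python sketch is only an illustration: the function names, the dictionary-based data layout and the three-point metric are assumptions made here and are not taken from the implementation described later in the paper. It computes all shortest path lengths with the Floyd-Warshall scheme and compares them with $D$.

```python
from itertools import combinations

def shortest_path_lengths(vertices, weighted_edges):
    """All-pairs shortest path lengths (Floyd-Warshall) in an undirected graph."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in vertices} for u in vertices}
    for (u, v), w in weighted_edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def is_realization(D, vertices, weighted_edges, tau, tol=1e-9):
    """True if the graph is connected and tau maps X isometrically into it."""
    dist = shortest_path_lengths(vertices, weighted_edges)
    connected = all(d < float("inf") for row in dist.values() for d in row.values())
    return connected and all(
        abs(dist[tau[x]][tau[y]] - D[x][y]) <= tol for x, y in combinations(D, 2)
    )

def total_length(weighted_edges):
    """Total edge length, the quantity minimized by an optimal realization."""
    return sum(weighted_edges.values())

# Hypothetical three-point metric (not the metric of Figure 1).
D = {
    "a": {"a": 0, "b": 2, "c": 3},
    "b": {"a": 2, "b": 0, "c": 3},
    "c": {"a": 3, "b": 3, "c": 0},
}
tau = {"a": "a", "b": "b", "c": "c"}
triangle = {("a", "b"): 2, ("a", "c"): 3, ("b", "c"): 3}
star = {("a", "m"): 1, ("b", "m"): 1, ("c", "m"): 2}  # "m" is an unlabeled vertex
path = {("a", "b"): 2, ("b", "c"): 3}                  # a-c has length 5, not 3
```

In this toy example both the triangle and the star realize the metric, but the star, which uses one vertex that is not labeled by an element of $X$, has smaller total edge length.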



Early work on optimal realizations started with (Hakimi and Yau, 1964)
(see also Varone (2006) for a comprehensive list of references), which focused
mainly on special classes of metrics such as, for example, those that admit
an optimal realization where the underlying graph is a tree (so-called treelike
metrics). Subsequently it was found that every metric on a finite set has an
optimal realization (Dress, 1984; Imrich et al., 1984), although this need not
be unique (cf. Figure 1(c) and (d)), and it was shown that computing an
optimal realization is NP-hard (Althöfer, 1988; Winkler, 1984). More recently,
there has been renewed interest in computational aspects of this problem.
For example, in Hertz and Varone (2007, 2008) (see also Dress et al. (2010))
a way to break up the problem of computing an optimal realization into
subproblems using so-called cut points is presented, and in Varone (2006) a
heuristic is presented for computing optimal realizations.

Here we present an alternative heuristic for systematically computing
optimal realizations that exploits the relationship between optimal
realizations of a metric $D$ and its so-called tight span $T_D$ (Dress, 1984;
Isbell, 1964). In brief (see Section 2 for details), $T_D$ is a polytopal
complex (essentially a union of polytopes) that can be canonically associated
to $D$, which is itself a (non-finite) metric space and into which the metric
$D$ can be canonically embedded. Remarkably, in (Dress, 1984) it is shown that
the 1-skeleton $G_D$ of $T_D$ (i.e., the edge-weighted graph formed essentially
by taking all of the 0- and 1-dimensional faces of $T_D$) is always a
realization of $D$. Moreover, Dress conjectured (Dress, 1984, (3.20)) that some
optimal realization of $D$ can always be obtained by removing some set of
edges from $G_D$.

While Dress' conjecture is still open for metrics in general, recently it
has been shown to hold for the class of so-called two-decomposable metrics
(Herrmann et al., 2011, Theorem 1.2), a class which includes treelike metrics
and $\ell_1$-distances between points in the plane (see Section 3 for more
details). In particular, this and Dress' aforementioned result suggest that it
could be useful to consider $G_D$ as a "search space" in which to look for some
optimal realization of $D$ (or at least some interesting realization of $D$
which has relatively small total edge length).

Guided by this principle, given an arbitrary finite metric $D$, in Section 4
we propose a heuristic for computing a realization of $D$ that is a subgraph
of $G_D$. This heuristic explores parts of $T_D$ in a way similar to the
classical simplex algorithm (Dantzig, 1963). Moreover, it does not explicitly
compute $G_D$, whose vertex set can have cardinality that is exponential in
$|X|$ (see e.g. Herrmann and Joswig (2007) for some explicit bounds). We also
show that the heuristic is guaranteed to find optimal realizations for some
simple types of metrics.

Since, as mentioned above, the problem of finding optimal realizations is
NP-hard, we assess the performance of our new heuristic using two strategies.
First, we consider a special instance of the problem where we take metrics
to be $\ell_1$-distances between points in the plane. In Section 5 we show that
finding optimal realizations of such a metric $D$ in $G_D$ is equivalent to the
so-called minimum Manhattan network problem (which was also recently
shown to be NP-hard (Chin et al., 2009)). This allows us to compare the
realizations computed by our heuristic with realizations computed using a
mixed integer linear program (MIP) for the minimum Manhattan network
problem presented in (Benkert et al., 2006) (see also Knauer and Spillner
(2011) for a comprehensive list of references on other approaches for solving
this well-studied problem). Second, in Section 6 we describe a mixed integer
program (MIP) for computing a minimal subrealization of a realization of
some metric, that is, a subrealization with minimum total edge length. This
allows us to obtain some impression of how close the realizations computed
by our heuristic are to a minimal subrealization of $G_D$ in case $|X|$ is not
too large. Moreover, in case the metric is two-decomposable, a minimal
subrealization of $G_D$ is (by the aforementioned result) an optimal realization
and so we can compare the realizations computed by our new heuristic with
optimal ones for this special class of metrics.

Based on these considerations, in Section 7 we present simulations for
$\ell_1$-distances, two-decomposable metrics and random metrics to assess the
performance of our heuristic. An implementation of this heuristic is freely
available for download at www.uea.ac.uk/cmp/research/cmpbio/CoMRiT/.
This includes the algorithm for efficiently computing cut points as described
in Dress et al. (2010) and auxiliary programs that generate the MIP
description for the minimum Manhattan network problem, as well as for the
problem of computing a minimal subrealization, so that they can be solved
using existing MIP solvers (we used the solver that is part of the GNU linear
programming kit (www.gnu.org/software/glpk/) in our experiments). We
conclude the paper with a brief discussion of some possible future directions
in Section 8.



2. Preliminaries 



In this section, we first recall the formal definition of the tight span of a
metric, a concept that has been discovered and re-discovered several times
in the literature (see e.g. Chrobak and Larmore (1994); Dress (1984); Isbell
(1964)). We also recall some facts concerning tight spans and optimal
realizations that will be used later on (for more on this see e.g. Dress et al.
(2012, Chapter 5)).



2.1. Some tight span theory 

A finite metric space is a pair $(X, D)$ consisting of a finite non-empty set
$X$ and a symmetric bivariate map $D : X \times X \to \mathbb{R}_{\geq 0}$ such
that $D(x, x) = 0$ and $D(x, z) \leq D(x, y) + D(y, z)$ hold for all
$x, y, z \in X$. A map $h : X \to X'$ from a metric space $(X, D)$ into a metric
space $(X', D')$ is an isometric embedding if $D'(h(x), h(y)) = D(x, y)$ holds
for all $x, y \in X$.

Now, given any finite metric space $(X, D)$, the tight span $T_D$ is defined
to be the polytopal complex (see e.g. Klee and Kleinschmidt (1999)) that is
the union of the bounded faces of the polyhedron

$$P_D := \{ f \in \mathbb{R}^X : f(x) + f(y) \geq D(x, y) \text{ for all } x, y \in X \}.$$

Viewed as a subset of $\mathbb{R}^X$, $T_D$ can be endowed with the
$\ell_\infty$-metric, which is defined by

$$D_\infty(f, g) = \max\{ |f(x) - g(x)| : x \in X \}$$

for all $f, g \in T_D$, so that $(T_D, D_\infty)$ is also a (non-finite!) metric
space. Note that there exists a canonical isometric embedding of $(X, D)$ into
$(T_D, D_\infty)$, the so-called Kuratowski embedding (Kuratowski, 1935), that
maps every $x \in X$ to $k_x : X \to \mathbb{R} : y \mapsto D(x, y)$. Note that
the map $k_x$ is a 0-dimensional face (that is, a vertex) of $T_D$ for every
$x \in X$ and, therefore, it is contained in the 1-skeleton $G_D$.
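
As a small numerical illustration of these definitions, the following Python sketch checks that the Kuratowski maps $k_x$ lie in $P_D$ and that $D_\infty(k_x, k_y) = D(x, y)$. The function names and the three-point metric are hypothetical choices made for this example, and the sketch does not compute the tight span itself.

```python
def kuratowski_point(D, x):
    """The map k_x : y -> D(x, y), stored as a dictionary."""
    return {y: D[x][y] for y in D}

def in_P_D(D, f, tol=1e-9):
    """Check the defining inequalities f(x) + f(y) >= D(x, y) of P_D."""
    return all(f[x] + f[y] >= D[x][y] - tol for x in D for y in D)

def d_inf(f, g):
    """The l_infinity distance between two maps on the same ground set."""
    return max(abs(f[x] - g[x]) for x in f)

# Hypothetical three-point metric used only for illustration.
D = {
    "a": {"a": 0, "b": 2, "c": 3},
    "b": {"a": 2, "b": 0, "c": 3},
    "c": {"a": 3, "b": 3, "c": 0},
}
k = {x: kuratowski_point(D, x) for x in D}
```

Membership of $k_x$ in $P_D$ is just the triangle inequality, and the equality $D_\infty(k_x, k_y) = D(x, y)$ is what makes the Kuratowski embedding isometric.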

Later we will use the fact that the tight span can be viewed as a hull of
the given metric space similar to the convex hull associated to a set of points
in Euclidean space. To make this more precise, define a map $h : X \to X'$
from a metric space $(X, D)$ into a metric space $(X', D')$ to be non-expansive
if $D'(h(x), h(y)) \leq D(x, y)$ holds for all $x, y \in X$, and a metric space
$(X', D')$ to be injective if for every metric space $(X, D)$ and every subset
$Y \subseteq X$ any non-expansive map of the subspace $(Y, D|_Y)$ into
$(X', D')$ can be extended to a non-expansive map of $(X, D)$ into $(X', D')$.
The tight span satisfies the following universal property (Dress, 1984;
Isbell, 1964):

Lemma 1. Any isometric embedding of a metric space $(X, D)$ into an
injective metric space $(X', D')$ can be extended to an isometric embedding of
$(T_D, D_\infty)$ into $(X', D')$.
2.2. Tight spans and optimal realizations

We now present a key relationship between realizations and tight spans
that was first discovered by A. Dress. Let $(X, D)$ be an arbitrary finite
metric space. Defining the map $\omega_D$ by putting
$\omega_D(\{u, v\}) = D_\infty(u, v)$ for all edges $\{u, v\}$ of $G_D$ and the
map $\tau_D$ by putting $\tau_D(x) = k_x$, it is shown in (Dress, 1984,
Theorem 5) that $(G_D = (V_D, E_D), \omega_D, \tau_D)$ is a realization of
$(X, D)$ (see also Dress et al. (2012, Theorem 5.15)). Moreover, in Dress (1984)
it is shown that, for any optimal realization $(G = (V, E), \omega, \tau)$ of
$(X, D)$, there exists a map $h : V \to T_D$ such that, for any $x \in X$,
$h(\tau(x)) = k_x$ and, for any edge $\{u, v\} \in E$,
$\omega(\{u, v\}) = D_\infty(h(u), h(v))$ hold. While this suggests that every
optimal realization $(G = (V, E), \omega, \tau)$ of $(X, D)$ is "contained" in
$T_D$, it was shown by Althöfer (1988) that it might not be isomorphic to
any sub-realization of $(G_D = (V_D, E_D), \omega_D, \tau_D)$. Still, as
mentioned in the introduction, it is not known whether or not there always
exists some optimal realization of $(X, D)$ that is a sub-realization of
$(G_D = (V_D, E_D), \omega_D, \tau_D)$.



3. Two-decomposable metrics 

In this section we shall consider a special class of finite metrics $D$, the
two-decomposable metrics, for which it is known that $G_D$ always contains
a subrealization that is an optimal realization of $D$. As mentioned in the
introduction, these metrics are of interest as we can in principle compute
optimal realizations for them exactly and thus measure the accuracy of our
heuristic for computing realizations for small metric spaces.

We first need to recall some relevant concepts. A split $S$ of a finite set
$X$ is a bipartition $\{A, B\}$ of $X$ into two non-empty subsets $A$ and $B$,
also denoted by $A|B$. For any $x \in X$, that set in $S$ that contains $x$ is
denoted by $S(x)$ and the other set by $\overline{S}(x)$. Two splits $A|B$ and
$A'|B'$ of $X$ are compatible if at least one of the intersections
$A \cap A'$, $A \cap B'$, $B \cap A'$ and $B \cap B'$ is empty. Otherwise the
two splits are incompatible. A set $\Sigma$ of splits of $X$ is called a split
system (on $X$). A split system $\Sigma$ is two-compatible if there is no
subset $\Sigma' \subseteq \Sigma$ with $|\Sigma'| = 3$ such that any two
distinct splits in $\Sigma'$ are incompatible.

Figure 2: (a) The tight span $T_D$ of the metric $D$ in Figure 1(a). It consists
of four maximal 2-dimensional faces surrounding the vertex $k_c$, and three
maximal 1-dimensional faces all of which have a vertex in common (and which
form the "fork" in the figure). (b) The 1-skeleton $G_D$ of $T_D$. (c) A
weighted two-compatible split system that induces $D$.

Now, for any split $S$ of $X$, define the metric $D_S$ on $X$ by putting, for
all $x, y \in X$, $D_S(x, y) = 0$ if $S(x) = S(y)$ holds and $D_S(x, y) = 1$
otherwise. A metric $D$ on $X$ is two-decomposable if there exists a
two-compatible split system $\Sigma$ on $X$ and a weighting
$\lambda : \Sigma \to \mathbb{R}_{>0}$ with
$D = \sum_{S \in \Sigma} \lambda(S) \cdot D_S$. We also say that $D$ is induced
by $\Sigma$ and the weighting $\lambda$. Later we will use the following result
(Herrmann et al., 2011, Theorem 1.2):



Theorem 2. Let $D$ be a two-decomposable metric on $X$. Then there always
exists an optimal realization that is a sub-realization of
$(G_D, \omega_D, \tau_D)$. In particular, there exists an optimal realization
$(G = (V, E), \omega, \tau)$ of $(X, D)$ such that there exists an injective map
$h : V \to T_D$ with $\omega(\{u, v\}) = D_\infty(h(u), h(v))$ for all edges
$\{u, v\} \in E$ and $h(\tau(x)) = k_x$ for all $x \in X$.

We illustrate this theorem in Figure 2. More specifically, the metric $D$ in
Figure 1(a) is two-decomposable, and its tight span is depicted in Figure 2(a).
The realization $G_D$ is pictured in Figure 2(b), and a two-compatible split
system associated to $D$ is given in Figure 2(c). Note that both of the optimal
realizations for $D$ given in Figure 1(c) and (d) can be obtained from $G_D$ by
removing precisely two edges.
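
The notions of compatibility, two-compatibility and the metric induced by a weighted split system, as defined in this section, translate directly into code. The sketch below is an illustration under assumed conventions: a split is represented as a pair of frozensets, and the function names and the example splits on $\{1, 2, 3, 4\}$ are hypothetical.

```python
from itertools import combinations

def compatible(s, t):
    """Two splits are compatible if one of the four intersections is empty."""
    return any(not (p & q) for p in s for q in t)

def is_two_compatible(splits):
    """No three splits may be pairwise incompatible."""
    return not any(
        all(not compatible(s, t) for s, t in combinations(triple, 2))
        for triple in combinations(splits, 3)
    )

def split_metric_sum(X, weighted_splits):
    """D = sum of lambda(S) * D_S, where D_S(x, y) = 1 iff S separates x and y."""
    return {
        x: {
            y: sum(w for (a, _b), w in weighted_splits if (x in a) != (y in a))
            for y in X
        }
        for x in X
    }

def split(a, b):
    return (frozenset(a), frozenset(b))

X = [1, 2, 3, 4]
s1, s2, s3 = split({1, 2}, {3, 4}), split({1, 3}, {2, 4}), split({1, 4}, {2, 3})
s4 = split({1}, {2, 3, 4})
D = split_metric_sum(X, [(s1, 1.0), (s2, 2.0)])
```

Here the three splits `s1`, `s2`, `s3` are pairwise incompatible, so the system consisting of all three is not two-compatible, while any system containing at most two of them is.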

We now prove two simple but useful facts concerning the relationship
between $\ell_1$-distances between points in the plane, two-decomposable
metrics and treelike metrics. For a point $p \in \mathbb{R}^2$ we denote by
$x(p)$ and $y(p)$ the x- and y-coordinate of $p$, respectively, and the
$\ell_1$-distance between two points $p, q \in \mathbb{R}^2$ by
$D_1(p, q) = |x(p) - x(q)| + |y(p) - y(q)|$. Then we have:

Lemma 3. Let $P$ be a finite non-empty set of points in $\mathbb{R}^2$. Then
the metric $D_1|_P$ is

(i) two-decomposable.

(ii) the sum of two treelike metrics.

Proof. (i) Let $\Sigma_v$ be the set of those splits $A|B$ of $P$ for which
there exists a real number $r$ such that $A = \{p \in P : x(p) < r\}$ and
$B = \{p \in P : x(p) > r\}$. Similarly, let $\Sigma_h$ be the set of those
splits $A|B$ of $P$ for which there exists a real number $r$ such that
$A = \{p \in P : y(p) < r\}$ and $B = \{p \in P : y(p) > r\}$. For every
$S \in \Sigma_v$, put $\alpha(S) = \min\{x(b) - x(a) : a \in A, b \in B\}$ and,
for every $S \in \Sigma_h$, put
$\beta(S) = \min\{y(b) - y(a) : a \in A, b \in B\}$. Note that any two splits
in $\Sigma_v$ as well as any two splits in $\Sigma_h$ are compatible. Hence,
since among any three splits in $\Sigma := \Sigma_v \cup \Sigma_h$ at least two
belong to the same of these two systems, the split system $\Sigma$ is
two-compatible.

Now, define, for any split $S$ in $\Sigma$, the weight

$$\lambda(S) = \begin{cases}
\alpha(S), & \text{if } S \in \Sigma_v \setminus \Sigma_h, \\
\beta(S), & \text{if } S \in \Sigma_h \setminus \Sigma_v, \\
\alpha(S) + \beta(S), & \text{if } S \in \Sigma_h \cap \Sigma_v.
\end{cases}$$

It is not hard to check that
$D_1|_P = \sum_{S \in \Sigma} \lambda(S) \cdot D_S$ holds, implying that
$D_1|_P$ is indeed two-decomposable.

(ii) Continuing to use the notation introduced in the proof of (i), note
that we have $D_1|_P = D_v + D_h$ with
$D_v = \sum_{S \in \Sigma_v} \alpha(S) \cdot D_S$ and
$D_h = \sum_{S \in \Sigma_h} \beta(S) \cdot D_S$. Therefore, it remains to note
that $D_v$ and $D_h$ are treelike in view of the fact that a metric space
$(X', D')$ is treelike if there exists a system $\Sigma'$ of pairwise
compatible splits of $X'$ and a map $\lambda' : \Sigma' \to \mathbb{R}_{>0}$
with $D' = \sum_{S \in \Sigma'} \lambda'(S) \cdot D_S$ (Buneman, 1971). □
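
The construction in the proof of Lemma 3(i) can be traced on a concrete point set. The following Python sketch is a minimal illustration, assuming points are given as distinct coordinate tuples; the function names and the example point set are hypothetical. For each coordinate it forms the splits between consecutive coordinate values, weights each by the gap between those values, and adds the weights of splits that arise from both coordinates.

```python
from itertools import combinations

def l1_split_decomposition(points):
    """Weighted split system {split: lambda(S)} as in the proof of Lemma 3(i)."""
    weights = {}
    for coord in (0, 1):  # 0: x-coordinate (Sigma_v), 1: y-coordinate (Sigma_h)
        values = sorted({p[coord] for p in points})
        for lo, hi in zip(values, values[1:]):
            A = frozenset(p for p in points if p[coord] <= lo)
            B = frozenset(points) - A
            S = frozenset((A, B))  # unordered, so A|B and B|A coincide
            # hi - lo is alpha(S) or beta(S); a split in both systems gets the sum
            weights[S] = weights.get(S, 0) + (hi - lo)
    return weights

def induced_distance(weights, p, q):
    """Sum of lambda(S) over all splits S separating p and q."""
    total = 0
    for S, w in weights.items():
        A, _B = S
        if (p in A) != (q in A):
            total += w
    return total

def l1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

points = [(0, 0), (2, 1), (1, 3), (4, 4)]   # hypothetical example
weights = l1_split_decomposition(points)
```

For every pair of points in the example, the induced distance agrees with the $\ell_1$-distance, in line with the identity $D_1|_P = \sum_{S \in \Sigma} \lambda(S) \cdot D_S$.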



4. Computing a realization in the tight span 

In this section we shall present our algorithm for computing realizations
using the tight span. We also prove that it is guaranteed to work for some
special types of metrics. Given a finite metric space $(X, D)$, the basic idea
of our algorithm is to select, for each pair $\{x, y\}$ of distinct elements in
$X$, a shortest path from $k_x$ to $k_y$ in $G_D$. The union of these paths is
then a realization of $(X, D)$. This is summarized in the form of pseudo-code
in Algorithm 1.

Pseudocode for the function find_path is presented in Algorithm 2. This
function essentially computes, for any vertex $u$ of $G_D$ and any $x \in X$, a
shortest path from $u$ to $k_x$ in $G_D$. To avoid computing the whole graph
$G_D$, it constructs such a path edge by edge employing the polyhedron $P_D$
as follows. It computes in polynomial time from the description of $P_D$ all
vertices $v$ of $G_D$ that are adjacent to $u$ in $G_D$. Among these vertices,
one with $D_\infty(u, k_x) = D_\infty(u, v) + D_\infty(v, k_x)$ that minimizes
$D_\infty(v, k_x)$ is selected. We refer to this as a simplex step from $u$
that arrives at vertex $v$, since this is similar to one step in Dantzig's
well-known simplex algorithm (Dantzig, 1963).

Algorithm 1: The basic algorithm.

Input: A finite metric space $(X, D)$
Output: A realization of $(X, D)$

    Initialize the graph $G = (V, E)$ with $V = \{k_x : x \in X\}$, $E = \emptyset$;
    Form a list $L$ of all pairs $\{x, y\} \in \binom{X}{2}$;
    foreach $\{x, y\} \in L$ do
        find_path($D$, $k_x$, $y$, $G$);
        /* Adds, if necessary, edges of $G_D$ to $G$ so that, after
           the call, $G$ contains a path of length $D(x, y)$ from $k_x$
           to $k_y$. */
    end
    return $(G = (V, E), \omega_D|_E, \tau_D)$

Algorithm 2: Compute a path using the existing partial realization.

Function: find_path($D$, $u$, $x$, $G$)

    Initialize $v = u$;
    if $u$ is a vertex of $G$ then
        Put $M$ the set of vertices $v$ of $G$ such that there is a path of
        length $D_\infty(u, v)$ from $u$ to $v$ in $G$ and
        $D_\infty(u, k_x) = D_\infty(u, v) + D_\infty(v, k_x)$;
        Put $v$ to be a vertex in $M$ with $D_\infty(v, k_x)$ minimum;
    end
    else
        Add $u$ to $G$;
    end
    if $v$ equals $k_x$ then
        return;
    end
    Make a simplex step from $v$ to arrive at vertex $w$;
    Add the edge $\{v, w\}$ to $G$;
    find_path($D$, $w$, $x$, $G$);

To make use of the fact that certain edges of $G_D$ might have been added
to $G$ in previous rounds of the foreach-loop in Algorithm 1, the function
find_path first explores whether the current graph $G$ already contains edges
that can serve as the initial part of a suitable path from $u$ to $k_x$. One
would expect that the choice of the order in which pairs are processed in
the foreach-loop has some impact on how many edges can be re-used in
subsequent rounds. We found that ordering the pairs according to increasing
distances between them tends to work well in practice. Then, in particular,
for any elements $x, y, z \in X$ with $D(x, y) + D(y, z) = D(x, z)$, no edges
will be added when processing the pair $\{x, z\}$.
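
The processing order described above is easy to state in code. The following Python sketch only illustrates the ordering and the additive condition $D(x, y) + D(y, z) = D(x, z)$; it does not implement find_path, and the function names and the three-point example are assumptions made here. The check additionally assumes that both summands are positive, so that the two shorter pairs are processed before $\{x, z\}$.

```python
from itertools import combinations

def ordered_pairs(D):
    """All pairs {x, y}, sorted by increasing distance D(x, y)."""
    return sorted(combinations(sorted(D), 2), key=lambda p: D[p[0]][p[1]])

def has_intermediate(D, x, z, tol=1e-9):
    """True if some y with D(x,y), D(y,z) > 0 satisfies D(x,y) + D(y,z) = D(x,z)."""
    return any(
        y not in (x, z)
        and D[x][y] > tol
        and D[y][z] > tol
        and abs(D[x][y] + D[y][z] - D[x][z]) <= tol
        for y in D
    )

# Hypothetical metric of three points on a line: a, b, c at positions 0, 1, 3.
D = {
    "a": {"a": 0, "b": 1, "c": 3},
    "b": {"a": 1, "b": 0, "c": 2},
    "c": {"a": 3, "b": 2, "c": 0},
}
order = ordered_pairs(D)
```

In this example the pair $\{a, c\}$ is processed last, and since $b$ lies between $a$ and $c$, the paths already selected for $\{a, b\}$ and $\{b, c\}$ form a path of the required length.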

Note that our algorithm is guaranteed to output an optimal realization
for any treelike metric and any metric that corresponds to the shortest path
distances between the pairs of vertices of a graph that is a cycle. The
former follows from the fact that, for any treelike metric, $G_D$ is a tree
(Dress, 1984), and the latter is an immediate consequence of the fact that we
process the pairs of elements in $X$ according to increasing distances between
them. Moreover, using the decomposition of metric realizations according to
Hertz and Varone (2007, 2008) as a preprocessing step, it follows that an
optimal realization can be obtained for a given metric $D$ if the decomposition
of $D$ yields only sub-instances for which our algorithm outputs an optimal
realization. In particular, it follows that our algorithm produces optimal
realizations for all inputs given in the appendix of Varone (2006).



5. Minimum Manhattan networks and optimal realizations 

In this section, using properties of the tight span, we give a concise proof
of the fact that the problem of computing a minimum Manhattan network
is nothing other than the problem of computing an optimal realization for a
special class of finite metric spaces (see also Eppstein (2011) for related
work). This allows us to directly compare our heuristic for computing
realizations with some existing algorithms for computing minimum Manhattan
networks. Note that this fact seems not to have been pointed out before in the
literature and has some interesting consequences for the computational
complexity of constructing an optimal realization, which we shall also point
out.

Figure 3: (a) A Manhattan network for the set $P$ of points drawn as black
dots. Other vertices of the network are drawn as empty circles. The network is
not minimum. (b) A minimum Manhattan network for $P$.



To state the main result of this section, we first introduce some more
notation. A Manhattan network $(G = (V, E), \omega)$ consists of a finite graph
$G$ whose vertex set $V \subseteq \mathbb{R}^2$ is a set of points in the plane
and a map $\omega$ that assigns to each edge $\{p, q\} \in E$ as its length the
$\ell_1$-distance $D_1(p, q)$ between the points $p$ and $q$. Here, for every
edge $\{p, q\} \in E$, the straight line segment $\overline{pq}$ with endpoints
$p$ and $q$ is required to be either horizontal or vertical and, for any two
distinct edges $e_1 = \{p_1, q_1\}$ and $e_2 = \{p_2, q_2\}$ in $E$, the
straight line segments $\overline{p_1 q_1}$ and $\overline{p_2 q_2}$ must not
cross, that is,
$\overline{p_1 q_1} \cap \overline{p_2 q_2} \subseteq e_1 \cap e_2$ holds.
For any path $\mathbf{p}$ from $p$ to $q$ in $G$, $\ell(\mathbf{p})$ denotes
the length of $\mathbf{p}$, and $\mathbf{p}$ is monotone if
$D_1(p, q) = \ell(\mathbf{p})$ holds.

Now, given a finite set of points $P \subseteq \mathbb{R}^2$, a Manhattan
network for $P$ is a Manhattan network $(G = (V, E), \omega)$ with
$P \subseteq V$ such that for any two distinct $p, q \in P$ there exists a
monotone path from $p$ to $q$ in $G$. Such a network is called minimum if its
total length is minimum among all Manhattan networks for $P$ (cf. Figure 3).
The minimum Manhattan network problem has been studied by several researchers
over the last few years (for a comprehensive list of references for this
problem see e.g. Knauer and Spillner (2011)). We have the following
relationship between minimum Manhattan networks and optimal realizations:

Theorem 4. Let $P$ be a finite non-empty set of points in $\mathbb{R}^2$. Then,
for any minimum Manhattan network $(G = (V, E), \omega)$ for $P$,
$(G = (V, E), \omega, \mathrm{id}_P)$ is an optimal realization of
$(P, D_1|_P)$, where $\mathrm{id}_P$ is the identity map on $P$.

Proof. By definition, any Manhattan network for $P$ is, up to adding the map
$\mathrm{id}_P$, a realization of $(P, D_1|_P)$. Hence, it suffices to show
that there exists a Manhattan network for $P$ whose total length is at most the
total length of some optimal realization of $(P, D_1|_P)$.

Consider an optimal realization $(G = (V, E), \omega, \tau)$ of $(P, D_1|_P)$
such that there exists an injective map $h : V \to T_{D_1|_P}$ with
$\omega(\{u, v\}) = D_\infty(h(u), h(v))$ for all edges $\{u, v\} \in E$ and
$h(\tau(p)) = k_p$ for all $p \in P$. By Lemma 3 and Theorem 2 such an optimal
realization always exists.

Now, since the metric space $(\mathbb{R}^2, D_1)$ is injective (see e.g.
Catusse et al. (2011)), it follows by Lemma 1 that for every finite set $P$ of
points in $\mathbb{R}^2$ there exists an isometric embedding of
$(T_{D_1|_P}, D_\infty)$ into $(\mathbb{R}^2, D_1)$ that maps every $k_p$,
$p \in P$, to $p$. Therefore, there exists an injective map
$g : V \to \mathbb{R}^2$ with $\omega(\{u, v\}) = D_1(g(u), g(v))$ for all
edges $\{u, v\} \in E$ and $g(\tau(p)) = p$ for all $p \in P$. To obtain a
Manhattan network for $P$, start with the points in $g(V)$ and then add, step
by step, for every $\{u, v\} \in E$, edges to obtain a monotone path from
$g(u)$ to $g(v)$. Note that in the resulting Manhattan network $\mathcal{N}$
the length of a shortest path between $g(u)$ and $g(v)$ can be at most the
length of a shortest path between $u$ and $v$ in $G$ for all $u, v \in V$. This
implies that there is a monotone path from $p$ to $q$ in $\mathcal{N}$ for all
$p, q \in P$. Hence, $\mathcal{N}$ is indeed a Manhattan network for $P$.
Finally, the total length of $\mathcal{N}$ is, by construction, not larger than
the total length of $G$, as required. □

Before concluding this section, we point out some interesting implications
of the last result:

Corollary 5. Computing an optimal realization of a finite metric space
$(X, D)$ is NP-hard even if

(i) $D$ is two-decomposable, or

(ii) $D$ is the sum of two treelike metrics on $X$.

Proof. In Chin et al. (2009) it is shown that computing (even just the total
edge length of) a minimum Manhattan network is NP-hard. In view of
Theorem 4 this implies that computing an optimal realization of $(P, D_1|_P)$
for a given point set $P$ is NP-hard. By Lemma 3(i) the metric $D_1|_P$ is
two-decomposable. This establishes (i). Alternatively, this also follows from
the NP-hardness proof in Althöfer (1988): It can be checked that the metric
that arises from applying the reduction is always two-decomposable.

In Lemma 3(ii) it was shown that $D_1|_P$ is even the sum of two treelike
metrics on $P$. This establishes (ii). □
6. Finding minimal subrealizations

In a similar spirit to finding optimal realizations, there is a whole family of
so-called inverse shortest path problems (see e.g. Cui and Hochbaum (2010)
and the references therein), where a minimum cost editing of a given graph
is sought so that the shortest path distances between certain pairs of vertices
equal given distances for those pairs. The problem of finding a minimal
subrealization mentioned in the introduction can be viewed as yet another
variant of this theme and we will briefly collect some facts about it in this
section.

First note that, in view of the fact that the problem of computing a
minimum Manhattan network is NP-hard (Chin et al., 2009) and the fact
that there is always a minimum Manhattan network that is contained in the
grid induced by the given point set (see e.g. Benkert et al. (2006)), we have:

Proposition 6. The problem of computing a minimal sub-realization of a
given realization $(G, \omega, \tau)$ is NP-hard even if $G$ is a
two-dimensional grid graph.
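
To make the grid induced by a point set concrete, the following Python sketch builds the grid graph on all intersections of the horizontal and vertical lines through the given points and checks that a network connects every pair of points by a path whose length equals their $\ell_1$-distance. This is an illustration only: the function names and the example data are hypothetical, integer coordinates are assumed so that lengths compare exactly, and the non-crossing condition on edges is not checked.

```python
import heapq
from itertools import combinations

def grid_graph(points):
    """Edges (with l_1 lengths) of the grid induced by the point set."""
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    edges = {}
    for y in ys:
        for x0, x1 in zip(xs, xs[1:]):
            edges[((x0, y), (x1, y))] = x1 - x0   # horizontal edge
    for x in xs:
        for y0, y1 in zip(ys, ys[1:]):
            edges[((x, y0), (x, y1))] = y1 - y0   # vertical edge
    return edges

def dijkstra(edges, source):
    """Shortest path lengths from source in an undirected weighted graph."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def connects_monotonically(edges, points):
    """True if every pair of points is joined by a path of length D_1(p, q)."""
    for p, q in combinations(points, 2):
        if dijkstra(edges, p).get(q) != abs(p[0] - q[0]) + abs(p[1] - q[1]):
            return False
    return True

points = [(0, 0), (2, 1), (1, 3)]          # hypothetical example
grid = grid_graph(points)
detour = {((0, 0), (0, -1)): 1, ((0, -1), (1, -1)): 1, ((1, -1), (1, 1)): 2}
```

The full grid always passes this check; a minimal subrealization is then a shortest sub-network of the grid that still passes it.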



Next note that, following a similar approach to the one used in Benkert et al.
(2006) for computing a minimum Manhattan network, one can phrase the
problem of computing a minimal subrealization as a MIP. For the convenience
of the reader, we include below the description of the MIP that we
used for benchmarking in the computational experiments and that yields,
for any given realization $(G = (V, E), \omega, \tau)$ of a finite metric space
$(X, D)$, a subgraph $G' = (V, E')$ of $G$ with minimum total edge length such
that $(G', \omega|_{E'}, \tau)$ is also a realization of $(X, D)$:

• For every edge $\{u, v\} \in E$, we introduce two directed edges $(u, v)$
  from $u$ to $v$ and $(v, u)$ from $v$ to $u$. Let $\vec{E}$ denote the set of
  these directed edges.

• For every edge $\{u, v\} \in E$, we have a binary variable $x_{\{u,v\}}$
  indicating whether or not $\{u, v\}$ is an edge of $G'$.

• For any two distinct elements $x, y \in X$, we send one unit of flow from
  $\tau(x)$ to $\tau(y)$ that ensures that there is at least one path from $x$
  to $y$ of length $D(x, y)$ in $G'$. To describe this flow, we introduce, for
  every directed edge $(u, v) \in \vec{E}$, a real-valued variable
  $f_{(u,v)\{x,y\}}$.

• For any two distinct elements $x, y \in X$, the variables must satisfy the
  following constraints:

  (1) $x_{\{u,v\}} \geq f_{(u,v)\{x,y\}} \geq 0$ and
      $x_{\{u,v\}} \geq f_{(v,u)\{x,y\}} \geq 0$ for all $\{u, v\} \in E$.

  (2) $\sum_{u : \{u,v\} \in E} (f_{(u,v)\{x,y\}} - f_{(v,u)\{x,y\}}) = 0$
      for all $v \in V \setminus \{\tau(x), \tau(y)\}$.

  (3) $\sum_{u : \{u,v\} \in E} (f_{(u,v)\{x,y\}} - f_{(v,u)\{x,y\}}) = -1$
      for $v = \tau(x)$.

  (4) $\sum_{u : \{u,v\} \in E} (f_{(u,v)\{x,y\}} - f_{(v,u)\{x,y\}}) = 1$
      for $v = \tau(y)$.

  (5) $\sum_{\{u,v\} \in E} \omega(\{u,v\}) \cdot (f_{(u,v)\{x,y\}} + f_{(v,u)\{x,y\}}) = D_G(\tau(x), \tau(y))$,
      where $D_G(\tau(x), \tau(y))$ denotes the length of a shortest path from
      $\tau(x)$ to $\tau(y)$ in $G$, which equals $D(x, y)$ since
      $(G, \omega, \tau)$ is a realization of $(X, D)$.

• The objective function is

  $$\sum_{\{u,v\} \in E} \omega(\{u,v\}) \cdot x_{\{u,v\}} \to \min.$$

In practice, we found that the size of the MIP can often be reduced
considerably by only introducing the variable $f_{(u,v)\{x,y\}}$ for those
edges $\{u, v\} \in E$ that actually lie on some shortest path from $\tau(x)$
to $\tau(y)$ in $G$.
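
The reduction just described relies on a simple test: a directed edge $(u, v)$ lies on some shortest path from $s$ to $t$ exactly when the distance from $s$ to $u$, plus the edge length, plus the distance from $v$ to $t$ equals the distance from $s$ to $t$. The Python sketch below illustrates this test under assumed conventions (dictionary-based graphs, hypothetical function names and example); it is not the auxiliary program used to generate the MIP in the experiments.

```python
def all_pairs(vertices, weighted_edges):
    """All-pairs shortest path lengths (Floyd-Warshall), undirected graph."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in vertices} for u in vertices}
    for (u, v), w in weighted_edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def edges_on_shortest_paths(dist, weighted_edges, s, t, tol=1e-9):
    """Directed edges (u, v) that lie on some shortest path from s to t."""
    result = []
    for (u, v), w in weighted_edges.items():
        for a, b in ((u, v), (v, u)):
            if abs(dist[s][a] + w + dist[b][t] - dist[s][t]) <= tol:
                result.append((a, b))
    return result

# Hypothetical example: the direct edge a-c is longer than the path a-b-c.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 3}
dist = all_pairs(["a", "b", "c"], edges)
needed = edges_on_shortest_paths(dist, edges, "a", "c")
```

Only the flow variables for the directed edges returned by this test would need to be created for the pair in question; all other flow variables for that pair can be omitted because constraint (5) forces them to zero.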

7. Computational Experiments 

To perform computational experiments, we have implemented the algorithm
described in Section 4 in C++ as an extension to the mathematical
software system polymake (Gawrilow and Joswig, 2000). In this implementation,
we apply, as a preprocessing step, the decomposition of a given instance
by cut points.

The experiments are designed to give an impression of the range of inputs
that can be attacked by our algorithm in terms of size and also how close
the realization produced by our algorithm is to an optimal realization. For
each size $n$ of the ground set of the metric space, 100 randomly generated
inputs were considered and we present the mean run-time $t$ of our algorithm
(including the preprocessing) and the mean ratio $r_{sg}$ between the length of
the realization produced by our algorithm and a minimal sub-realization of
$(G_D, \omega_D, \tau_D)$ (if available). The variance of these values was
usually quite low and is omitted.

In the tables below, $t_{TS}$ denotes the time to compute the whole tight span
(if the size of the tight span permitted computing it using polymake),
$t_{solve}$ denotes the time needed to solve the MIP described in Section 6
using the solver glpsol from the GNU linear programming kit, and $r_{TS}$
denotes the ratio of the length of the realization produced by our algorithm
to the total edge length of the whole 1-skeleton of the tight span. A *
indicates that the corresponding value could not be obtained because the
1-skeleton of the tight span was too large or at least too large to solve the
resulting MIP. All run-times were taken on an Intel(R) Core(TM)2 Quad CPU
2.66GHz machine running CentOS 5.6 using only one core.

   n         t     t_man   r_sg       t_TS   t_solve   r_TS
   5      0.28      0.24   1.01       0.01      0.40   0.93
  10      0.65      0.41   1.15       0.07      2.23   0.66
  15      1.46      0.70   1.22       4.11     18.90   0.55
  20      3.07      1.12   1.27     254.49    386.28   0.50
  25      7.78      1.47   1.30   15075.02   7690.96   0.46
  30     11.49      2.04   1.34          *         *      *
  35     21.25      2.84   1.37          *         *      *
  40     37.99      4.03   1.39          *         *      *
  45     64.61      5.68   1.41          *         *      *
  50    105.42      7.89   1.42          *         *      *
  55    167.75     11.27   1.43          *         *      *
  60    256.51     16.66   1.44          *         *      *
  65    379.84     22.79   1.45          *         *      *
  70    555.02     31.90   1.47          *         *      *
  75    791.90     43.60   1.48          *         *      *
  80   1110.62     61.06   1.49          *         *      *
  85   1838.51    116.24   1.50          *         *      *
  90   2330.65    631.08   1.40          *         *      *

Table 1: Results of the computational experiments for instances of the minimum
Manhattan network problem.

7.1. Manhattan networks 

Inputs were generated by choosing n random points on a $10^6 \times 10^6$
integer grid. In addition to the MIP described in Section 6, we also used the
MIP presented in Benkert et al. (2006) to compute an optimal realization for
each input point set. The run time $t_{man}$ for solving this alternative MIP
using glpksol is also given in Table 1. As can be seen, across the instances
our realizations are usually within a factor of | of the optimum. Note that
there exist several polynomial-time algorithms that are guaranteed to produce
a realization whose length is within a constant factor of the optimum;
currently, for the best known algorithms, the factor is 2 (Chepoi et al.,
2008; Guo et al., 2008; Nouioua, 2005).
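
The input generation for the Manhattan network experiments can be sketched as
follows. This is only an illustration under stated assumptions: the text does
not say whether the points are required to be distinct or how the grid
coordinates are indexed, so the sketch draws distinct points with coordinates
in 0, ..., 10^6 - 1, and the function names are hypothetical.

```python
import random

GRID = 10**6  # side length of the integer grid


def random_grid_points(n, seed=None):
    """Draw n distinct random points with integer coordinates (assumption)."""
    rng = random.Random(seed)
    points = set()
    while len(points) < n:
        points.add((rng.randrange(GRID), rng.randrange(GRID)))
    return sorted(points)


def manhattan_metric(points):
    """Distance matrix of the points under the l1 (Manhattan) distance."""
    return [
        [abs(px - qx) + abs(py - qy) for (qx, qy) in points]
        for (px, py) in points
    ]


points = random_grid_points(8, seed=0)
D = manhattan_metric(points)
```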



Left (sum of two treelike metrics):

  n         t       t_ts    t_solve   r_sg   r_ts
  5      0.46       0.01       0.43   1.02   0.95
 10      1.46       0.07       2.05   1.10   0.77
 15      3.00       3.49       6.83   1.16   0.70
 20      5.44     225.32      43.73   1.19   0.66
 25      9.18   13174.89     314.37   1.22   0.63
 30     12.87          *          *      *      *
 35     24.13          *          *      *      *
 40     38.62          *          *      *      *
 45     75.90          *          *      *      *
 50    114.40          *          *      *      *
 55    169.91          *          *      *      *
 60    250.89          *          *      *      *
 65    363.25          *          *      *      *
 70    506.94          *          *      *      *
 75    587.90          *          *      *      *
 80    844.98          *          *      *      *
 85   1090.04          *          *      *      *
 90   1319.21          *          *      *      *
100   2143.58          *          *      *      *

Right (general two-decomposable metrics):

  n         t       t_ts    t_solve   r_sg   r_ts
  5      0.45       0.01       0.49   1.04   0.81
 10      1.06       0.07       2.00   1.16   0.67
 15      2.29       3.49       9.42   1.17   0.59
 20     21.05     222.07      83.64   1.22   0.58

Table 2: Results of the computational experiments for metrics that are the sum of two 
treelike metrics (left) and general two-decomposable metrics (right). 

7.2. Two-decomposable metrics 

Recall that, if the metric D is two-decomposable, there exists an optimal
realization that is a sub-realization of $(G_D, \omega_D, \tau_D)$
(see Section 3). Hence, $r_{sg}$ is actually the ratio between the length of
the realization produced by our algorithm and the length of an optimal
realization. We tested two types of two-decomposable metrics (cf. Table 2):

Metrics that are the sum of two treelike metrics: We chose two random
binary trees with n leaves, took the set of these leaves as the ground set of
the metric space and assigned uniformly distributed lengths (between 1 and
$10^6$) to the edges of the trees. We then formed the sum of the two treelike
metrics realized by these binary trees.
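
A minimal sketch of this construction is given below. How the random binary
trees were sampled is not specified in the text; the sketch assumes unrooted
binary trees grown by attaching each new leaf to a uniformly chosen edge, and
integer edge lengths. All function names are hypothetical.

```python
import random


def random_binary_tree(n, rng):
    """Unrooted binary tree on leaves 0..n-1 (n >= 2) as {(u, v): length}.

    Assumption: each new leaf is attached to a uniformly random edge.
    """
    edges = [(0, 1)]
    next_internal = n  # internal vertices are numbered n, n+1, ...
    for leaf in range(2, n):
        u, v = edges.pop(rng.randrange(len(edges)))
        w = next_internal
        next_internal += 1
        edges += [(u, w), (w, v), (w, leaf)]  # subdivide edge, attach leaf
    return {e: rng.randint(1, 10**6) for e in edges}


def tree_metric(n, weighted_edges):
    """Leaf-to-leaf path lengths in the weighted tree."""
    adj = {}
    for (u, v), w in weighted_edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    D = [[0] * n for _ in range(n)]
    for src in range(n):
        dist = {src: 0}
        stack = [src]
        while stack:  # depth-first traversal; paths in a tree are unique
            u = stack.pop()
            for v, w in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    stack.append(v)
        for leaf in range(n):
            D[src][leaf] = dist[leaf]
    return D


def sum_of_two_tree_metrics(n, seed=None):
    """Sum of the metrics of two independent random trees on the same leaves."""
    rng = random.Random(seed)
    d1 = tree_metric(n, random_binary_tree(n, rng))
    d2 = tree_metric(n, random_binary_tree(n, rng))
    return [[d1[i][j] + d2[i][j] for j in range(n)] for i in range(n)]
```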

Metrics resulting from random two-compatible split systems: We generated
random two-compatible split systems of size In by generating random splits
and adding each of them to an initially empty system whenever the system
remained two-compatible after the addition. The metric considered in the
experiment is the metric induced by the resulting split system, where we again
assigned uniformly distributed weights to the splits.
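
The rejection procedure described here can be sketched as follows. The sketch
uses the standard notions that two splits are compatible if some pair of their
sides is disjoint, and that a split system is two-compatible if it contains no
three pairwise incompatible splits. The distribution of the random splits, the
weight range, the cap on the number of attempts and all function names are
assumptions made for illustration; the target size is left as a parameter.

```python
import random
from itertools import combinations


def compatible(a, b, ground):
    """Splits a|ground-a and b|ground-b are compatible if two sides are disjoint."""
    ca, cb = ground - a, ground - b
    return not (a & b) or not (a & cb) or not (ca & b) or not (ca & cb)


def stays_two_compatible(system, new, ground):
    """True if adding `new` creates no three pairwise incompatible splits."""
    clashing = [s for s in system if not compatible(s, new, ground)]
    return all(compatible(s, t, ground) for s, t in combinations(clashing, 2))


def random_two_compatible_system(n, size, seed=None, max_tries=100_000):
    """Greedily collect up to `size` random splits of {0, ..., n-1}."""
    rng = random.Random(seed)
    ground = frozenset(range(n))
    system = []
    for _ in range(max_tries):
        if len(system) == size:
            break
        side = frozenset(x for x in ground if rng.random() < 0.5)
        if 0 in side:
            side = ground - side  # store the side that does not contain 0
        if not side or side in system:
            continue  # not a proper split, or already present
        if stays_two_compatible(system, side, ground):
            system.append(side)
    return ground, system


def split_metric(n, system, rng):
    """Metric induced by the system: total weight of splits separating i and j."""
    weights = {s: rng.randint(1, 10**6) for s in system}
    return [
        [sum(w for s, w in weights.items() if (i in s) != (j in s)) for j in range(n)]
        for i in range(n)
    ]
```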
  n         t       t_ts    t_solve   r_sg   r_ts
  5      0.48       0.01       0.53   1.04   0.90
  6      0.72       0.01       1.80   1.06   0.74
  7      0.88       0.02       3.46   1.10   0.56
  8      1.35       0.04      10.12   1.15   0.39
  9      1.31       0.12     758.07   1.20   0.26
 10      1.32       0.35   33652.08   1.21   0.18
 15      3.52     300.29          *      *   0.01
 25     15.97          *          *      *      *
 30     40.47          *          *      *      *
 35     84.81          *          *      *      *
 40    181.73          *          *      *      *
 45    330.98          *          *      *      *
 50    545.36          *          *      *      *
 55    749.25          *          *      *      *
 60   1204.18          *          *      *      *
 65   2081.53          *          *      *      *

Table 3: Results of the computational experiments for general metrics. 

7.3. Random metrics 

Finally, we generated random metrics on n points by choosing each pairwise
distance uniformly between $10^6$ and $2 \cdot 10^6$. The results are
presented in Table 3. Note that in this experiment it is not known whether
$(G_D, \omega_D, \tau_D)$ contains an optimal realization of the given metric
as a sub-realization. Therefore, the value $r_{sg}$ is only a lower bound on
the ratio between the length of the realization produced by our algorithm and
the length of an optimal realization.
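
Choosing all distances in this range always yields a metric: every distance is
at most $2 \cdot 10^6$, while the sum of any two distances is at least
$10^6 + 10^6$, so the triangle inequality holds automatically. The following
sketch generates such a metric and checks this; drawing integer rather than
real-valued distances and the function names are assumptions.

```python
import random
from itertools import permutations

LOW, HIGH = 10**6, 2 * 10**6  # range of the pairwise distances


def random_metric(n, seed=None):
    """Symmetric matrix with independent uniform off-diagonal distances."""
    rng = random.Random(seed)
    D = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = rng.randint(LOW, HIGH)
    return D


def satisfies_triangle_inequality(D):
    """Check D[i][j] <= D[i][k] + D[k][j] for all distinct i, j, k."""
    n = len(D)
    return all(
        D[i][j] <= D[i][k] + D[k][j] for i, j, k in permutations(range(n), 3)
    )


D = random_metric(7, seed=3)
```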

8. Discussion 

Our experiments indicate that, for most of the inputs considered, the
length of the realization produced by our algorithm is within a factor of |
of the length of a shortest sub-realization of $(G_D, \omega_D, \tau_D)$ or
even within a factor of | of the length of an optimal realization. It would be
interesting to investigate whether our algorithm (or a suitable variant of it)
yields a constant-factor approximation algorithm, at least for certain classes
of metrics such as two-decomposable metrics.

We also see that our algorithm can produce realizations for metric spaces
with up to 50 elements, even in the case of general random metrics. Note also
that all computations are done with arbitrary-precision rationals/integers to
ensure combinatorial accuracy. Using floating-point numbers instead (which
would make sense at least for the general random metrics, that is, generic
metrics) could further speed up the computations.
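
The trade-off between exact and floating-point arithmetic can be illustrated
with a toy equality test of the form f(x) + f(y) = D(x, y), the kind of test
on which combinatorial decisions may depend. This is only a sketch under the
assumption that such equality tests occur; the function names and the numbers
are hypothetical.

```python
import math
from fractions import Fraction


def is_tight(fx, fy, dxy):
    """Exact test whether f(x) + f(y) equals D(x, y)."""
    return fx + fy == dxy


def is_tight_approx(fx, fy, dxy, rel_tol=1e-9):
    """Tolerance-based variant, as one would need with floating-point input."""
    return math.isclose(fx + fy, dxy, rel_tol=rel_tol)


# With exact rationals the equality 1/10 + 2/10 = 3/10 is recognised.
exact = is_tight(Fraction(1, 10), Fraction(2, 10), Fraction(3, 10))

# With binary floating point the same test fails because of rounding.
floating = is_tight(0.1, 0.2, 0.3)

# A tolerance recovers the answer here, but the threshold must be chosen.
approximate = is_tight_approx(0.1, 0.2, 0.3)
```

With rationals the test is exact at the price of slower arithmetic; with
floating-point numbers a tolerance has to be chosen, which is less delicate
for generic metrics, where exact coincidences are not expected.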

In future work it would also be interesting to develop an exact,
exponential-time algorithm for computing an optimal realization of any metric
space. This would be helpful for benchmarking heuristics, and would also
allow Dress' conjecture to be checked on more examples. We expect that this
could at least give some further insight into the structure of the problem.